Converting Romanized Persian to the Arabic Writing System

نویسندگان

  • Jalal Maleki
  • Lars Ahrenberg
چکیده

This paper describes a syllabification based conversion method for converting romanized Persian text to the traditional Arabic-based writing system. The system is implemented in Xerox XFST and relies on rule based conversion of words rather than using morphological analysis. The paper presents a brief evaluation of the accuracy of the transcriptions generated by the method.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Converting Romanized Persian to the Arabic Writing Systems

This paper describes a syllabification based conversion method for converting romanized Persian text to the traditional Arabic-based writing system. The system is implemented in Xerox XFST and relies on rule based conversion of words rather than using morphological analysis. The paper presents a brief evaluation of the accuracy of the transcriptions generated by the method.

متن کامل

Applying Finite State Morphology to Conversion Between Roman and Perso-Arabic Writing Systems

This paper presents a method for converting back and forth between the Perso-Arabic and a Romanized writing systems for Persian. Given a word in one writing system, we use finite state transducers to generate morphological analysis for the word that is subsequently used to regenerate the orthography of the word in the other writing system. The system has been implemented in XFST and LEXC.

متن کامل

Tajik-Farsi Persian Transliteration Using Statistical Machine Translation

Tajik Persian is a dialect of Persian spoken primarily in Tajikistan and written with a modified Cyrillic alphabet. Iranian Persian, or Farsi, as it is natively called, is the lingua franca of Iran and is written with the Persian alphabet, a modified Arabic script. Although the spoken versions of Tajik and Farsi are mutually intelligible to educated speakers of both languages, the difference be...

متن کامل

Automatic Transliteration of Romanized Dialectal Arabic

In this paper, we address the problem of converting Dialectal Arabic (DA) text that is written in the Latin script (called Arabizi) into Arabic script following the CODA convention for DA orthography. The presented system uses a finite state transducer trained at the character level to generate all possible transliterations for the input Arabizi words. We then filter the generated list using a ...

متن کامل

Syllable Based Transcription of English Words into Perso-Arabic Writing System

This paper presents a rule-based method for transcription of English words into the PersoArabic orthography. The method relies on the phonetic representation of English words such as the CMU pronunciation dictionary. Some of the challenging problems are the context-based vowel representation in the Perso-Arabic writing system and the mismatch between the syllabic structures of English and Persi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008